Personalised, Collaborative Spam Filtering

Authors

  • Alan Gray
  • Mads Haahr
Abstract

The state of the art sees content-based filters tending towards collaborative filters, whereby email is filtered at the mail transfer agent (MTA) with users feeding information back about false positives and negatives. While this improves the ability of the filter to track concept drift in spam over time, such approaches make assumptions implicit in centralised spam filtering, such as that all users consider the same email to be spam. In this paper, we detail and analyse these assumptions and describe how they affect spam filtering. We present an architecture for personalised, collaborative spam filtering and describe the design and implementation of a proof-of-concept, peer-to-peer, signature-based system based on the architecture. The evaluation is based on real-world users employing the system as their spam-filtering tool. Preliminary analysis of the results indicates that the implementation is accurate and efficient.
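The idea of a personalised, signature-based collaborative filter can be illustrated with a minimal Python sketch. This is not the authors' system: the abstract gives no implementation detail, so the class name `PersonalFilter`, the SHA-256 signature over normalised text, the per-peer trust values, the 0.25 adjustment step, and the threshold are all assumptions made purely for illustration. The sketch shows one way each user could weigh peers' spam reports by a locally held trust score, so that two users receiving the same reports may still reach different verdicts.

```python
import hashlib
from collections import defaultdict


def signature(body: str) -> str:
    """Hypothetical signature: SHA-256 of the lower-cased, whitespace-collapsed body."""
    normalised = " ".join(body.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


class PersonalFilter:
    """Illustrative per-user filter; trust values and threshold are assumptions."""

    def __init__(self, threshold: float = 1.0) -> None:
        self.threshold = threshold
        # Trust this user places in each peer, starting at a neutral 0.5.
        self.trust = defaultdict(lambda: 0.5)
        # Maps a signature to the set of peers that reported it as spam.
        self.reports = defaultdict(set)

    def receive_report(self, peer: str, sig: str) -> None:
        """Record that a peer has flagged the given signature as spam."""
        self.reports[sig].add(peer)

    def score(self, body: str) -> float:
        """Sum this user's trust in every peer that reported the message."""
        peers = self.reports.get(signature(body), ())
        return sum(self.trust[p] for p in peers)

    def is_spam(self, body: str) -> bool:
        return self.score(body) >= self.threshold

    def feedback(self, body: str, user_says_spam: bool) -> None:
        """Raise or lower trust in the reporting peers based on the user's own verdict."""
        delta = 0.25 if user_says_spam else -0.25
        for peer in self.reports.get(signature(body), ()):
            self.trust[peer] = min(1.0, max(0.0, self.trust[peer] + delta))


mail = "Buy NOW"
user_filter = PersonalFilter(threshold=1.0)
user_filter.receive_report("alice", signature("buy   now"))
user_filter.receive_report("bob", signature(mail))
flagged_before = user_filter.is_spam(mail)
user_filter.feedback(mail, user_says_spam=False)  # this user disagrees with the peers
flagged_after = user_filter.is_spam(mail)
```

In this sketch the disagreement lowers the user's trust in both reporting peers, so later reports from them carry less weight for this user only. An exact hash is used for brevity; a practical signature scheme would need to tolerate small variations between copies of the same spam.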


Similar resources

Towards Symbiotic Spam E-mail Filtering

This position paper discusses the use of symbiotic filtering, a novel distributed data mining approach that combines content-based and collaborative filtering for spam detection.


Collaborative Blog Spam Filtering Using Adaptive Percolation Search

We propose a novel collaborative filtering method for link spams on blogs. The key idea is to rely on manual identification of spams and share this information about spams through a network of trust. The blogger who has identified a spam tells a small number of fellow bloggers (content implantation), and those who have not heard about it start a search using an adaptive percolation search, comb...


A Survey of Content-based Spam Classifiers

Unsolicited bulk e-mail (spam) is a growing problem with tangible costs felt by virtually every Internet user. There are many solutions to this problem, ranging from simple blacklisting to advanced text classification and collaborative filtering. None of these techniques provides a total solution, but new technologies and their application offer increasingly effective filters. This paper provid...


What Happened to Content-Based Information Filtering?

Personalisation can have a significant impact on the way information is disseminated on the web today. Information Filtering can be a significant ingredient towards a personalised web. Collaborative Filtering is already being applied successfully for generating personalised recommendations of music tracks, books, movies and more. The same is not true for Content-Based Filtering. In this paper, ...


A Case-Based Approach to Spam Filtering that Can Track Concept Drift

There are a few key benefits of a case-based approach to spam filtering. First, the many different sub-types of spam suggest that a local learner, such as Case-Based Reasoning (CBR) will perform well. Second, the lazy approach to learning in CBR allows for easy updating as new types of spam arrive. Third, the case-based approach to spam filtering allows for the sharing of cases and thus a shari...



Publication date: 2004